Generating Gaussian Mixture Models by Model Selection For Speech Recognition

نویسنده

  • Kai Yu
چکیده

While all modern speech recognition systems use Gaussian mixture models, there is no standard method to determine the number of mixture components. Current choices for mixture component numbers are usually arbitrary with little justification. In this paper we apply some common model selection methods to determine the number of mixture components. We show that they are ill-suited for the speech recognition task because increasing test set data likelihood does not necessarily improve speech recognition performance. We then present a model selection criterion modified to better suit speech recognition, and its positive and negative effects on speech recognition performance.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Recognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model

Speech is one of the most opulent and instant methods to express emotional characteristics of human beings, which conveys the cognitive and semantic concepts among humans. In this study, a statistical-based method for emotional recognition of speech signals is proposed, and a learning approach is introduced, which is based on the statistical model to classify internal feelings of the utterance....

متن کامل

A Comparative Study of Gender and Age Classification in Speech Signals

Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...

متن کامل

Speech Enhancement using Laplacian Mixture Model under Signal Presence Uncertainty

In this paper an estimator for speech enhancement based on Laplacian Mixture Model has been proposed. The proposed method, estimates the complex DFT coefficients of clean speech from noisy speech using the MMSE  estimator, when the clean speech DFT coefficients are supposed mixture of Laplacians and the DFT coefficients of  noise are assumed zero-mean Gaussian distribution. Furthermore, the MMS...

متن کامل

Generating small, accurate acoustic models with a modified Bayesian information criterion

Although Gaussian mixture models are commonly used in acoustic models for speech recognition, there is no standard method for determining the number of mixture components. Most models arbitrarily assign the number of mixture components with little justification. While model selection techniques with a mathematical derivation, such as the Bayesian information criterion (BIC), have been applied, ...

متن کامل

Variational Bayesian GMM for speech recognition

In this paper, we explore the potentialities of Variational Bayesian (VB) learning for speech recognition problems. VB methods deal in a more rigorous way with model selection and are a generalization of MAP learning. VB training for Gaussian Mixture Models is less affected than EM-ML training by overfitting and singular solutions. We compare two types of Variational Bayesian Gaussian Mixture M...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006